Multidimensional Chebyshev's inequality


In probability theory, the multidimensional Chebyshev's inequality is a generalization of Chebyshev's inequality, which bounds the probability that a random vector deviates from its expected value by more than a specified amount.

Let X be an N-dimensional random vector with expected value [math]\displaystyle{ \mu=\operatorname{E}[X] }[/math] and covariance matrix

[math]\displaystyle{ V=\operatorname{E} [(X - \mu) (X - \mu)^T]. \, }[/math]

If [math]\displaystyle{ V }[/math] is a positive-definite matrix, then for any real number [math]\displaystyle{ t\gt 0 }[/math]:

[math]\displaystyle{ \Pr \left( \sqrt{( X-\mu)^T V^{-1} (X-\mu) } \gt t\right) \le \frac N {t^2} }[/math]
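The bound holds for any distribution with a finite, positive-definite covariance matrix. As a quick illustration, here is a minimal Monte Carlo sketch in Python (NumPy assumed; the Gaussian example distribution and all parameter values are arbitrary choices, not part of the statement):

```python
import numpy as np

# Minimal Monte Carlo sketch of the bound (example values are arbitrary):
# estimate Pr( sqrt((X-mu)^T V^{-1} (X-mu)) > t ) and compare it with N/t^2.
rng = np.random.default_rng(0)

N = 3                                    # dimension of the random vector X
mu = np.array([1.0, -2.0, 0.5])          # expected value E[X]
A = rng.standard_normal((N, N))
V = A @ A.T + N * np.eye(N)              # a positive-definite covariance matrix

samples = rng.multivariate_normal(mu, V, size=200_000)
diff = samples - mu
# Squared Mahalanobis distance (X-mu)^T V^{-1} (X-mu) for each sample.
y = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(V), diff)

for t in (1.5, 2.0, 3.0):
    empirical = np.mean(np.sqrt(y) > t)
    print(f"t={t}:  Pr(dist > t) ~ {empirical:.4f}  <=  N/t^2 = {N/t**2:.4f}")
```

For small [math]\displaystyle{ t }[/math] the bound [math]\displaystyle{ N/t^2 }[/math] exceeds 1 and is vacuous; it becomes informative once [math]\displaystyle{ t \gt \sqrt N }[/math].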

Proof

Since [math]\displaystyle{ V }[/math] is positive-definite, so is [math]\displaystyle{ V^{-1} }[/math]. Define the random variable

[math]\displaystyle{ y = (X-\mu)^T V^{-1} (X-\mu). }[/math]

Since [math]\displaystyle{ y }[/math] is nonnegative, Markov's inequality applies:

[math]\displaystyle{ \Pr\left( \sqrt{(X-\mu)^T V^{-1} (X-\mu) } \gt t\right) = \Pr( \sqrt{y} \gt t) = \Pr(y \gt t^2) \le \frac{\operatorname{E}[y]}{t^2}. }[/math]

Finally, since [math]\displaystyle{ y }[/math] is a scalar it equals its own trace, so the cyclic property of the trace and the linearity of expectation give

[math]\displaystyle{ \begin{align} \operatorname{E}[y] &= \operatorname{E}[(X-\mu)^T V^{-1} (X-\mu)]\\[6pt] &= \operatorname{E}[ \operatorname{trace} ( V^{-1} (X-\mu) (X-\mu)^T )]\\[6pt] &= \operatorname{trace} ( V^{-1} \operatorname{E}[(X-\mu) (X-\mu)^T] )\\[6pt] &= \operatorname{trace} ( V^{-1} V ) = \operatorname{trace}(I_N) = N. \end{align} }[/math]

Substituting [math]\displaystyle{ \operatorname{E}[y] = N }[/math] into the Markov bound yields the stated inequality.
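The trace manipulation in this last step can also be checked numerically. The following sketch (NumPy assumed; the Gaussian example distribution and parameters are arbitrary) computes [math]\displaystyle{ \operatorname{E}[y] }[/math] both by directly averaging the quadratic form and through the trace identity, using the sample covariance in place of [math]\displaystyle{ \operatorname{E}[(X-\mu)(X-\mu)^T] }[/math]:

```python
import numpy as np

# Sketch of the final step: E[y] = trace(V^{-1} E[(X-mu)(X-mu)^T]) = N.
# The Gaussian example distribution below is an arbitrary illustration.
rng = np.random.default_rng(1)

N = 4
mu = np.zeros(N)
A = rng.standard_normal((N, N))
V = A @ A.T + np.eye(N)                  # positive-definite covariance matrix

X = rng.multivariate_normal(mu, V, size=500_000)
diff = X - mu
Vinv = np.linalg.inv(V)

mean_y = np.einsum('ij,jk,ik->i', diff, Vinv, diff).mean()
sample_cov = diff.T @ diff / len(diff)   # estimate of E[(X-mu)(X-mu)^T]
trace_form = np.trace(Vinv @ sample_cov) # same quantity via the trace identity

print(f"E[y] by direct average : {mean_y:.3f}")
print(f"E[y] via trace identity: {trace_form:.3f}   (theory: N = {N})")
```

The two printed values agree up to floating-point error, since averaging the quadratic form and taking the trace against the sample covariance are algebraically the same computation; both approach [math]\displaystyle{ N }[/math] as the sample size grows.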